Search Result

Fast high average-utility itemset mining algorithm based on utility-list structure

WANG Jinghua, LUO Xiangzhou, WU Qian

Journal of Computer Applications 2016, 36 (11): 3062-3066. DOI: 10.11772/j.issn.1001-9081.2016.11.3062

High utility itemset mining has been widely studied in data mining, but it does not account for the effect of itemset length. High average-utility itemset mining was proposed to address this issue. However, existing high average-utility itemset mining algorithms take a long time to mine the high average-utility itemsets. To solve this problem, an improved high average-utility itemset mining algorithm, named FHAUI (Fast High Average Utility Itemset), was proposed. FHAUI stores the utility information in a utility-list structure and mines all the high average-utility itemsets from that structure. FHAUI also uses a two-dimensional matrix to reduce the number of join operations. Experimental results on several classical datasets show that FHAUI greatly reduces the number of join operations and the running time.
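
To make the average-utility measure concrete, the following is a minimal brute-force sketch, not FHAUI itself: it uses neither the utility-list structure nor the two-dimensional matrix described in the abstract. It assumes the usual definition, in which the average utility of an itemset is the sum of its items' utilities over the transactions that contain the whole itemset, divided by the itemset length. The toy database, the threshold, and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> utility
# (for example, quantity times unit profit).
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3},
]

def average_utility(itemset, db):
    """Utility of the itemset summed over the transactions that contain
    every one of its items, divided by the itemset length."""
    total = sum(
        sum(t[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )
    return total / len(itemset)

def high_average_utility_itemsets(db, min_au):
    """Enumerate every itemset and keep those whose average utility
    reaches min_au. Exponential in the number of items; for illustration only."""
    items = sorted({i for t in db for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            au = average_utility(itemset, db)
            if au >= min_au:
                result[itemset] = au
    return result

print(high_average_utility_itemsets(transactions, min_au=10))
```

Dividing by the itemset length is what keeps longer itemsets from qualifying merely because they contain more items, which is the length effect the abstract refers to.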

Project keyword lexicon and keyword semantic network based on word co-occurrence matrix

WANG Qing, CHEN Zeya, GUO Jing, CHEN Xi, WANG Jinghua

Journal of Computer Applications 2015, 35 (6): 1649-1653. DOI: 10.11772/j.issn.1001-9081.2015.06.1649

To solve the problems of keyword extraction and project keyword lexicon construction for technological projects in professional fields, an algorithm for building the lexicon based on semantic relations and a co-occurrence matrix was proposed. Building on conventional keyword extraction based on the co-occurrence matrix, the algorithm also considers the location, property and Inverse Document Frequency (IDF) of the keywords to improve the traditional approach. In addition, a method was given for establishing a keyword semantic network from the co-occurrence matrix and for identifying hot keywords by computing their similarity with a semantic base vector. Finally, 882 project documents from the electric power field were used in the experiments. The results show that the proposed algorithm effectively extracts keywords for technological projects, establishes the keyword correlation network, and achieves better precision, recall and F1-score than a Chinese text keyword extraction algorithm based on multi-feature fusion.
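
As a rough illustration of the co-occurrence step only, the sketch below counts how often two keywords appear in the same document and keeps the frequent pairs as edges of a keyword network. It is not the authors' algorithm: the location, property and IDF weighting and the semantic base vector are omitted, and the sample documents, the `min_count` threshold and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical, already-tokenized documents.
docs = [
    ["transformer", "fault", "diagnosis"],
    ["transformer", "fault", "grid"],
    ["grid", "load", "forecast"],
]

def cooccurrence(docs):
    """Count, for each unordered keyword pair, the number of documents
    in which both keywords appear."""
    counts = Counter()
    for words in docs:
        # sorted(set(...)) gives each pair one canonical order per document
        for u, v in combinations(sorted(set(words)), 2):
            counts[(u, v)] += 1
    return counts

def keyword_network(counts, min_count=2):
    """Keep pairs that co-occur in at least min_count documents as edges."""
    return {pair for pair, n in counts.items() if n >= min_count}

counts = cooccurrence(docs)
print(keyword_network(counts))
```

The `Counter` acts as a sparse form of the co-occurrence matrix; a dense matrix would hold the same counts indexed by keyword position.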
